69 research outputs found
Multimedia Semantic Integrity Assessment Using Joint Embedding Of Images And Text
Real world multimedia data is often composed of multiple modalities such as
an image or a video with associated text (e.g. captions, user comments, etc.)
and metadata. Such multimodal data packages are prone to manipulations, where a
subset of these modalities can be altered to misrepresent or repurpose data
packages, with possible malicious intent. It is, therefore, important to
develop methods to assess or verify the integrity of these multimedia packages.
Using computer vision and natural language processing methods to directly
compare the image (or video) and the associated caption to verify the integrity
of a media package is only possible for a limited set of objects and scenes. In
this paper, we present a novel deep learning-based approach for assessing the
semantic integrity of multimedia packages containing images and captions, using
a reference set of multimedia packages. We construct a joint embedding of
images and captions with deep multimodal representation learning on the
reference dataset in a framework that also provides image-caption consistency
scores (ICCSs). The integrity of query media packages is assessed as the
inlierness of the query ICCSs with respect to the reference dataset. We present
the MultimodAl Information Manipulation dataset (MAIM), a new dataset of media
packages from Flickr, which we make available to the research community. We use
both the newly created dataset as well as Flickr30K and MS COCO datasets to
quantitatively evaluate our proposed approach. The reference dataset does not
contain unmanipulated versions of tampered query packages. Our method is able
to achieve F1 scores of 0.75, 0.89 and 0.94 on MAIM, Flickr30K and MS COCO,
respectively, for detecting semantically incoherent media packages.
Comment: *Ayush Jaiswal and Ekraam Sabir contributed equally to the work in
this paper
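The scoring idea in this abstract (a consistency score per image-caption pair, judged by how well it fits among reference scores) can be illustrated with a minimal sketch. Everything here is an assumption for illustration: cosine similarity between toy embedding vectors stands in for the paper's ICCS, an empirical-CDF rank stands in for its inlierness measure, and the names `cosine_similarity`, `inlierness`, `is_consistent` and the 0.05 threshold are hypothetical.

```python
import math


def cosine_similarity(u, v):
    """Cosine similarity between two equal-length vectors (stand-in for an ICCS)."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    if norm_u == 0.0 or norm_v == 0.0:
        return 0.0  # treat a zero vector as having no similarity
    return dot / (norm_u * norm_v)


def inlierness(query_score, reference_scores):
    """Fraction of reference scores that are <= the query score.

    Values near 0 mean the query is less consistent than almost every
    reference package, i.e. it looks like an outlier.
    """
    if not reference_scores:
        raise ValueError("reference_scores must be non-empty")
    below = sum(1 for s in reference_scores if s <= query_score)
    return below / len(reference_scores)


def is_consistent(query_score, reference_scores, min_inlierness=0.05):
    """Hypothetical decision rule: accept if the query is not in the lowest tail."""
    return inlierness(query_score, reference_scores) >= min_inlierness


# Toy reference set: scores of image-caption pairs assumed to be untampered.
reference = [0.70, 0.80, 0.85, 0.90, 0.95]

# Toy query: orthogonal image and caption embeddings, i.e. unrelated content.
query = cosine_similarity([1.0, 0.0], [0.0, 1.0])
print(is_consistent(query, reference))
```

A real system would learn the joint embedding from the reference dataset; the sketch only shows how a query score is compared against the reference distribution.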
Modeling Heterogeneous Statistical Patterns in High-dimensional Data by Adversarial Distributions: An Unsupervised Generative Framework
Since collecting labels is prohibitively costly and time-consuming, unsupervised
methods are preferred in applications such as fraud detection. Meanwhile, such
applications usually require modeling the intrinsic clusters in
high-dimensional data, which usually displays heterogeneous statistical
patterns as the patterns of different clusters may appear in different
dimensions. Existing methods propose to model the data clusters on selected
dimensions, yet globally omitting any dimension may damage the pattern of
certain clusters. To address the above issues, we propose a novel unsupervised
generative framework called FIRD, which utilizes adversarial distributions to
fit and disentangle the heterogeneous statistical patterns. When applied to
discrete spaces, FIRD effectively distinguishes the synchronized fraudsters
from normal users. Besides, FIRD also provides superior performance on anomaly
detection datasets compared with SOTA anomaly detection methods (over 5%
average AUC improvement). The experimental results on various
datasets verify that the proposed method can better model the heterogeneous
statistical patterns in high-dimensional data and benefit downstream
applications
Security in Process: Detecting Attacks in Industrial Process Data
Due to the fourth industrial revolution, industrial applications make use of
the progress in communication and embedded devices. This allows industrial
users to increase efficiency and manageability while reducing cost and effort.
Furthermore, the fourth industrial revolution, creating the so-called Industry
4.0, opens a variety of novel use and business cases in the industrial
environment. However, this progress comes at the cost of an enlarged attack
surface of industrial companies. Operational networks that have previously been
physically separated from public networks are now connected in order to make
use of new communication capabilities. This motivates the need for industrial
intrusion detection solutions that are compatible with the long-term operation
of machines in industry as well as the heterogeneous and fast-changing networks.
In this work, process data is analysed. The data is created and monitored on
real-world hardware. After a set up phase, attacks are introduced into the
systems that influence the process behaviour. A time series-based anomaly
detection approach, Matrix Profiles, is adapted to the specific needs and
applied to intrusion detection. The results indicate that these methods are
applicable to detecting attacks in the process behaviour. Furthermore, they are
easily integrated into existing process environments. Additionally, the one-class
classifiers One-Class Support Vector Machine and Isolation Forest are applied
to the data without a notion of timing. While Matrix Profiles perform well in
terms of creating and visualising results, the one-class classifiers perform
poorly
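To make the time series technique concrete, the following is a minimal sketch of a textbook matrix profile, not the adapted variant the abstract refers to. For every window of length `m` it records the z-normalised Euclidean distance to the nearest other window, skipping trivially overlapping neighbours; the window with the largest value (the "discord") is the anomaly candidate. The function names, the exclusion-zone size of `m // 2`, and the toy signal are all assumptions for illustration.

```python
import math


def znorm(window):
    """Z-normalise a window; a constant window maps to all zeros."""
    n = len(window)
    mean = sum(window) / n
    std = math.sqrt(sum((x - mean) ** 2 for x in window) / n)
    if std == 0.0:
        return [0.0] * n
    return [(x - mean) / std for x in window]


def matrix_profile(series, m):
    """Naive O(n^2 * m) matrix profile over windows of length m."""
    n_sub = len(series) - m + 1
    exclusion = max(1, m // 2)  # skip neighbours that overlap too much
    if m < 2 or n_sub < 2 * exclusion + 2:
        raise ValueError("series too short for this window length")
    windows = [znorm(series[i:i + m]) for i in range(n_sub)]
    profile = []
    for i in range(n_sub):
        best = math.inf
        for j in range(n_sub):
            if abs(i - j) <= exclusion:
                continue
            dist = math.sqrt(sum((a - b) ** 2
                                 for a, b in zip(windows[i], windows[j])))
            best = min(best, dist)
        profile.append(best)
    return profile


def top_discord(series, m):
    """Start index of the window that is farthest from its nearest neighbour."""
    profile = matrix_profile(series, m)
    return max(range(len(profile)), key=profile.__getitem__)


# Toy process signal: a repeating pattern with one injected spike at index 21.
signal = [0.0, 1.0, 0.0, -1.0] * 10
signal[21] = 5.0
print(top_discord(signal, 4))  # start of a window overlapping the spike
```

Production libraries use much faster algorithms than this quadratic loop; the sketch only shows why a window that matches nothing else in the series stands out.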
Mid-infrared plasmons in scaled graphene nanostructures
Plasmonics takes advantage of the collective response of electrons to
electromagnetic waves, enabling dramatic scaling of optical devices beyond the
diffraction limit. Here, we demonstrate mid-infrared (4 to 15 microns)
plasmons in deeply scaled graphene nanostructures down to 50 nm, more than 100
times smaller than the on-resonance light wavelength in free space. We reveal,
for the first time, the crucial damping channels of graphene plasmons via its
intrinsic optical phonons and scattering from the edges. A plasmon lifetime of
20 femtoseconds or less is observed when damping through the emission of
an optical phonon is allowed. Furthermore, the surface polar phonons in the SiO2
substrate underneath the graphene nanostructures lead to a significantly
modified plasmon dispersion and damping, in contrast to a non-polar
diamond-like-carbon (DLC) substrate. Much reduced damping is realized when the
plasmon resonance frequencies are close to the polar phonon frequencies. Our
study paves the way for applications of graphene in plasmonic waveguides,
modulators and detectors in an unprecedentedly broad wavelength range from
sub-terahertz to mid-infrared.
A Novel Heat Shock Transcription Factor Family in <i>Entamoeba histolytica</i>
The HSTF is a master molecule involved in the transcriptional control of several genes during different types of stress. This transcription factor is a highly conserved protein identified in organisms ranging from bacteria to humans. <i>Entamoeba histolytica</i> is the protozoan responsible for human amoebiasis. This parasite is exposed to different kinds of stress, such as changes in pH, temperature and drug exposure, all situations in which the parasite needs to survive. Here we identified and isolated a novel gene family of HSTFs in the protozoan parasite <i>E. histolytica</i>. Three members, which we called <i>Ehhstf1</i>, <i>Ehhstf2</i> and <i>Ehhstf3</i>, compose this family. Amino acid alignments and domain architecture analysis revealed that the EhHSTFs present a conserved DNA-binding domain composed of approximately 25 residues. Interestingly, this domain is shorter than the domain of the human, mouse and yeast HSTFs. Heterologous antibodies recognized four peptides of 73, 66, 47 and 23 kDa in total extracts from trophozoites grown under normal conditions. The 73, 47 and 23 kDa peptides increased in intensity when the cells were grown at 42°C for 2 h. Taken together, these results demonstrate that the amoeba has HSTFs, which may control the gene expression of this parasite under different stress situations
Neuromatch Academy: a 3-week, online summer school in computational neuroscience
Neuromatch Academy (https://academy.neuromatch.io; (van Viegen et al., 2021)) was designed as an online summer school to cover the basics of computational neuroscience in three weeks. The materials cover dominant and emerging computational neuroscience tools, how they complement one another, and specifically focus on how they can help us to better understand how the brain functions. An original component of the materials is its focus on modeling choices, i.e. how do we choose the right approach, how do we build models, and how can we evaluate models to determine if they provide real (meaningful) insight. This meta-modeling component of the instructional materials asks what questions can be answered by different techniques, and how to apply them meaningfully to get insight about brain function
Finishing the euchromatic sequence of the human genome
The sequence of the human genome encodes the genetic instructions for human physiology, as well as rich information about human evolution. In 2001, the International Human Genome Sequencing Consortium reported a draft sequence of the euchromatic portion of the human genome. Since then, the international collaboration has worked to convert this draft into a genome sequence with high accuracy and nearly complete coverage. Here, we report the result of this finishing process. The current genome sequence (Build 35) contains 2.85 billion nucleotides interrupted by only 341 gaps. It covers ∼99% of the euchromatic genome and is accurate to an error rate of ∼1 event per 100,000 bases. Many of the remaining euchromatic gaps are associated with segmental duplications and will require focused work with new methods. The near-complete sequence, the first for a vertebrate, greatly improves the precision of biological analyses of the human genome including studies of gene number, birth and death. Notably, the human genome seems to encode only 20,000-25,000 protein-coding genes. The genome sequence reported here should serve as a firm foundation for biomedical research in the decades ahead
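As a rough worked figure, the two numbers quoted in this abstract can be combined to estimate how many error events the stated accuracy implies across the whole sequence. This is only a back-of-the-envelope sketch: it assumes the approximate rate applies uniformly to all 2.85 billion nucleotides, which the abstract does not claim.

```latex
\[
  2.85 \times 10^{9}\ \text{bases}
  \times \frac{1\ \text{event}}{10^{5}\ \text{bases}}
  \approx 2.85 \times 10^{4}\ \text{events}
\]
```

Under that assumption, the stated accuracy would correspond to on the order of 28,500 error events across the finished sequence.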